Abstract: We consider the optimal duration allocation in a
decision making queue. Decision making tasks arrive at a given
rate to a human operator. The correctness of the decision made
by the human evolves as a sigmoidal function of the duration
allocated to the task. Each task in the queue loses its value
continuously. We elucidate this trade-off and determine
optimal policies for the human operator. We show that the optimal
policy requires the human to drop some tasks. We present a
receding horizon optimization strategy, and compare it with the
greedy policy.

I. Introduction 

The advent of cheap camera sensors has prompted the extensive
deployment of camera sensor networks for surveillance [5], [9].
Typically, the feeds of these cameras are sent to a central
location, where a human operator looks at these feeds and decides
on the existence of certain features. The correctness of the
decision on a particular feed depends on the time-duration the
human operator allocates to it. Thus, for a busy human operator,
the optimal time-duration allocation is critically important.

Recently, there has been significant interest in understanding
the physics of human decision making [3]. Several mathematical
models for human decision making have been proposed [3], [13].
These models suggest that the correctness of the decision of a
human operator in a binary decision making scenario evolves as a
sigmoidal function of the time-duration allocated for the decision.
Thus, the probability of the correct decision by a human operator
remains small till a critical time, and then jumps to a larger
value. When a human operator has to serve a queue of decision
making tasks, the tasks (e.g., feeds from cameras) waiting in the
queue lose value continuously. This trade-off between the
correctness of the decision and the loss in the value of the
pending tasks is of critical importance for the performance of the
human operator. In this paper, we address this trade-off, and
determine optimal duration allocation policies for the human
operator serving a decision making queue. Alternatively, we
determine the task release rate that yields the desired accuracy
for each task.

There has been significant interest in the study of the
performance of a human operator serving a queue. Schmidt [18]
models the human as a server and numerically studies a queueing
model to determine the performance of a human air traffic
controller. Maximally stabilizing task release policies for a
human-in-the-loop queue have been studied by Savla et al. [14],
[15], [16]. Bertuccelli et al. [1] and Savla et al. [17] study
human supervisory control for unmanned aerial vehicle operations.
Donohue et al. [8] study an optimal duration allocation problem
in a time constrained static queue.

The optimal control of queueing systems [19] is a classical
problem in queueing theory. Stidham et al. [12] study the
optimal service policies for an M/G/1 queue. They formulate
a semi-Markov decision process, and describe the qualitative
features of the solution. Certain assumptions in [12] are
relaxed by George et al. [10]. Hernández-Lerma et al. [11]
study an optimal adaptive service rate control policy for an
M/G/1 queue with an unknown arrival rate.

We study the problem of optimal time-duration allocation
in a queue of binary decision making tasks with a human
operator. We refer to such queues as decision making queues.
We consider three particular problems. First, we consider a
time constrained static queue, where the human operator has
to perform a given number of tasks within a prescribed
time. Second, we consider a static queue with latency penalty.
Here, the human operator has to perform a given number
of tasks. The operator incurs a penalty due to the delay in
processing of each task. This penalty can be thought of as the
loss in the value of the task over time. Last, we consider a
dynamic queue of decision making tasks. The tasks arrive
at a fixed rate and the operator incurs a penalty for the delay
in processing of each task. In all three problems, there
is a trade-off between the reward obtained by processing
a task, and the penalty incurred due to the resulting delay in
processing other tasks. We address this particular trade-off.
The major contributions of this work are:

i) We determine a closed form optimal duration allocation
policy for the time constrained static decision making queue
and the static decision making queue with penalty.

ii) We provide a simple procedure to determine an optimal
duration allocation policy for the dynamic decision
making queue.

iii) We rigorously establish that the optimal duration
allocation policy may drop some tasks, i.e., not process the
tasks at all.



The remainder of the paper is organized in the following
way. We discuss some preliminary concepts in Section II.
We present the problem setup in Section III. We present the
optimal allocation policy for the time constrained static queue in
Section IV. The static queue with latency penalty is considered
in Section V. We present the optimal allocation policy
for the dynamic queue with latency penalty in Section VI. We
illustrate these problems further through some examples
in Section VII. Our conclusions are presented in Section VIII.

II. Preliminaries 

A. Speed-accuracy trade-off in human decision making 

Consider the scenario where the human has to decide on one of
the two alternatives $H_0$ and $H_1$, based on the collected
evidence. The evolution of the probability of correct decision
has been studied in the cognitive psychology literature [2], [3],
[13].

Pew's model: The probability of deciding on hypothesis
$H_1$, given that hypothesis $H_1$ is true, at a given time
$t \in \mathbb{R}_{\ge 0}$ is given by

$$\mathbb{P}(\text{say } H_1 \mid H_1, t) = \frac{p_0}{1 + e^{-(at - b)}},$$

where $p_0 \in [0, 1]$ and $a, b \in \mathbb{R}$ are some parameters which
depend on the human operator [2], [13].

Drift diffusion model: Conditioned on the hypothesis $H_1$,
the evolution of the evidence for decision is modeled as
a drift-diffusion process [3]. Given a drift rate $\beta > 0$
and a diffusion rate $\sigma$, with a decision threshold $\eta$, the
conditional probability of the correct decision is

$$\mathbb{P}(\text{say } H_1 \mid H_1, t) = \frac{1}{\sqrt{2\pi\sigma^2 t}} \int_{\eta}^{\infty} e^{-\frac{(\Lambda - \beta t)^2}{2\sigma^2 t}} \, d\Lambda,$$

where $\Lambda \equiv \mathcal{N}(\beta t, \sigma^2 t)$ is the evidence at time $t$.
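As a numerical illustration of the drift diffusion model, the probability that Gaussian evidence with mean $\beta t$ and variance $\sigma^2 t$ exceeds the threshold $\eta$ can be evaluated with the complementary error function. The following is a minimal sketch; the function name and the parameter values ($\beta = 1$, $\sigma = 1$, $\eta = 5$) are assumptions chosen for illustration and do not come from the paper.

```python
import math


def ddm_correct_probability(t, beta, sigma, eta):
    """P(evidence > eta) when evidence ~ Normal(beta * t, sigma**2 * t), t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    # Standardize the threshold, then take the upper Gaussian tail.
    z = (eta - beta * t) / (sigma * math.sqrt(t))
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Illustrative (assumed) parameters; the curve rises in an S shape.
    for t in (1.0, 5.0, 10.0, 20.0):
        p = ddm_correct_probability(t, beta=1.0, sigma=1.0, eta=5.0)
        print(f"t = {t:5.1f}  P(correct) = {p:.4f}")
```

With these assumed parameters the probability stays close to zero for small $t$, equals one half when $\beta t = \eta$, and approaches one for large $t$, which is the sigmoidal behavior the model is meant to capture.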




Fig. 1. The sigmoidal evolution of the probabilities of correct decision:
(a) Pew's model, (b) drift diffusion model.

B. Sigmoidal functions

A sigmoidal function is a smooth function
$f : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ defined by

$$f(t) = f_{\mathrm{cvx}}(t)\, I(t < t_{\mathrm{inf}}) + f_{\mathrm{cnv}}(t)\, I(t \ge t_{\mathrm{inf}}),$$

where $f_{\mathrm{cvx}}$ and $f_{\mathrm{cnv}}$ are monotonically increasing convex
and concave functions, respectively, $I(\cdot)$ is the indicator
function, and $t_{\mathrm{inf}}$ is the inflection point. The derivative of a
sigmoidal function is unimodal with its maximum at $t_{\mathrm{inf}}$.
Further, $f'(0) \ge 0$ and $\lim_{t \to \infty} f'(t) = 0$.
Also, $\lim_{t \to \infty} f''(t) = 0$. A typical graph of the first and
second derivative of a sigmoidal function is shown in Figure 2.
Note that the conditional probabilities of correct decision
evolve as sigmoidal functions in Pew's model as well as in the
drift-diffusion model.
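The properties listed above can be checked numerically on a concrete sigmoid. The sketch below uses the logistic function $f(t) = 1/(1 + \exp(5 - t))$ as an assumed example, with inflection point $t_{\mathrm{inf}} = 5$; the helper names are hypothetical. It also recovers the two times at which the derivative attains a given value, one on each side of the inflection point.

```python
import math

T_INF = 5.0  # inflection point of the assumed logistic example


def f(t):
    return 1.0 / (1.0 + math.exp(T_INF - t))


def f_prime(t):
    s = f(t)
    return s * (1.0 - s)


def f_second(t):
    s = f(t)
    return s * (1.0 - s) * (1.0 - 2.0 * s)


def bisect(root_is_right, lo, hi, iters=200):
    """Bisection driven by a predicate telling whether the root lies right of mid."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if root_is_right(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def times_with_slope(y):
    """Return (t_minus, t_plus) with f'(t) = y; t_minus is None if it is negative."""
    if y <= 0.0 or y > f_prime(T_INF):
        return None
    # f' decreases on [T_INF, inf): the larger solution lies there.
    t_plus = bisect(lambda t: f_prime(t) > y, T_INF, T_INF + 60.0)
    t_minus = None
    if y >= f_prime(0.0):
        # f' increases on [0, T_INF]: the smaller solution lies there.
        t_minus = bisect(lambda t: f_prime(t) < y, 0.0, T_INF)
    return t_minus, t_plus


if __name__ == "__main__":
    print("max slope:", f_prime(T_INF))
    print("times with slope 0.1:", times_with_slope(0.1))
```

For this example the derivative peaks at $t_{\mathrm{inf}}$ with value $1/4$, and the second derivative is positive before the inflection point and negative after it.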




Fig. 2. (a) First derivative of the sigmoidal function and penalty rate. A
particular value of the derivative may be attained at two different times.
The total benefit decreases till $t_{\min}$, increases from $t_{\min}$ to
$t_{\max}$, and then decreases again. (b) Second derivative of the sigmoidal
function. A particular positive value of the second derivative may be
attained at two different times.

C. Receding horizon optimization 

Consider the following infinite horizon dynamic optimization
problem:

$$\begin{aligned}
\text{maximize} \quad & \sum_{\ell=1}^{\infty} \psi(x(\ell), u(\ell)) \\
\text{subject to} \quad & x(\ell + 1) = \phi(x(\ell), u(\ell)),
\end{aligned} \tag{1}$$

where $x(\ell), u(\ell) \in \mathbb{R}$ are the state and control input at time
$\ell \in \mathbb{N}$, respectively, $\psi : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is the stage cost, and
$\phi : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ defines the nonlinear evolution of the system.

The receding horizon optimization [6] approximates the
optimization problem (1) by the following finite horizon
optimization problem at each stage $\theta \in \mathbb{N}$:

$$\begin{aligned}
\text{maximize} \quad & \sum_{\ell=\theta}^{\theta+N} \psi(x(\ell), u(\ell)) \\
\text{subject to} \quad & x(\ell + 1) = \phi(x(\ell), u(\ell)),
\end{aligned} \tag{2}$$

where $N \in \mathbb{N}$ is a finite horizon length. The receding horizon
optimization is summarized as follows.

Algorithm 1 Receding horizon optimization
1: At time $\theta \in \mathbb{N}$, observe state $x(\theta)$
2: Solve optimal control problem (2) and compute the
   optimal control inputs $u^*(\theta), \dots, u^*(\theta + N)$
3: Apply $u^*(\theta)$, and set $\theta = \theta + 1$
4: Go to Step 1
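The loop in Algorithm 1 can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the solver used in the paper: the finite horizon problem is solved by brute-force enumeration over a small discrete control set, the horizon counts the number of stages summed, and the scalar system $x(\ell+1) = x(\ell) + u(\ell)$ with stage reward $-x^2 - 0.1u^2$ is hypothetical.

```python
from itertools import product


def receding_horizon(x0, step, reward, controls, horizon, n_stages):
    """Apply only the first input of each finite-horizon plan, then re-plan."""
    x = x0
    trajectory, applied = [x0], []
    for _ in range(n_stages):
        best_seq, best_val = None, float("-inf")
        # Brute-force search over all control sequences of the given length.
        for seq in product(controls, repeat=horizon):
            val, xs = 0.0, x
            for u in seq:
                val += reward(xs, u)
                xs = step(xs, u)
            if val > best_val:
                best_seq, best_val = seq, val
        u0 = best_seq[0]          # apply the first optimal input only
        x = step(x, u0)           # the system moves one stage forward
        applied.append(u0)
        trajectory.append(x)
    return trajectory, applied


if __name__ == "__main__":
    traj, inputs = receding_horizon(
        x0=3,
        step=lambda x, u: x + u,                  # assumed toy dynamics
        reward=lambda x, u: -x * x - 0.1 * u * u, # assumed toy stage reward
        controls=(-1, 0, 1),
        horizon=3,
        n_stages=5,
    )
    print(traj, inputs)
```

Enumeration scales exponentially with the horizon, so it is only meant to make the observe, plan, apply, shift structure of the algorithm explicit.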



III. Problem setup 

We consider the problem of optimal time duration allocation
for a human operator. The decision making tasks arrive at
a given rate and are processed by a human operator (see
Figure 3). The human receives a unit reward for the correct
decision, while there is no penalty for a wrong decision. For
a decision made at time $t$, the expected reward is

$$\mathbb{E}[\mathbf{1}_{\text{say } H_1} \mid H_1, t] = \mathbb{P}(\text{say } H_1 \mid H_1, t) = f(t), \tag{3}$$

where $f : \mathbb{R}_{\ge 0} \to [0, 1]$ is a sigmoidal function.

We consider three particular problems. First, we consider a
time constrained static queue, i.e., the human operator has
to perform $N \in \mathbb{N}$ identical decision making tasks within
time $T \in \mathbb{R}_{>0}$. Second, we consider a static queue with
penalty, i.e., the human operator has to perform $N \in \mathbb{N}$
identical decision making tasks, but each task loses value
at a constant rate per unit delay in its processing. Last,
we consider a dynamic queue of identical decision making
tasks where each task loses value at a constant rate per unit
delay in its processing. For such a decision making queue,
we are interested in the optimal time-duration allocation to
each task. Alternatively, we are interested in the task release
rate that will result in the desired accuracy for each task.
We intend to design a decision support system that tells the
human operator the optimal time-duration allocation for each
task.

Fig. 3. Problem setup. The decision making tasks arrive at a rate $\lambda$. These
tasks are served by a human operator with sigmoidal performance. Each
task loses value while waiting in the queue.



IV. Time constrained static queue 
A. Problem description 

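Consider that the human operator has to perform $N \in \mathbb{N}$
identical tasks within time $T \in \mathbb{R}_{>0}$. Let the human
operator allocate duration $t_\ell$ to the task $\ell \in \{1, \dots, N\}$. The
following optimization problem encapsulates the objective of
the human operator:

$$\begin{aligned}
\underset{t}{\text{maximize}} \quad & \sum_{\ell=1}^{N} f(t_\ell) \\
\text{subject to} \quad & \sum_{\ell=1}^{N} t_\ell \le T, \\
& t \succeq 0,
\end{aligned} \tag{4}$$

where $t = \{t_1, \dots, t_N\}$ is the duration allocation vector.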
B. Optimal policy 

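We define the Lagrangian
$L : \mathbb{R}^N_{\ge 0} \times \mathbb{R}_{\ge 0} \times \mathbb{R}^N_{\ge 0} \to \mathbb{R}$ for
the constrained optimization problem (4) by

$$L(t, \alpha, \mu) = \sum_{\ell=1}^{N} f(t_\ell) + \alpha \Big( T - \sum_{\ell=1}^{N} t_\ell \Big) + \mu^T t.$$

We also define the dual cost function
$h : \mathbb{R}_{\ge 0} \times \mathbb{R}^N_{\ge 0} \to \mathbb{R}$ by

$$h(\alpha, \mu) = \max_{t} L(t, \alpha, \mu). \tag{5}$$

The dual problem to the primal optimization problem (4) is

$$\begin{aligned}
\text{minimize} \quad & h(\alpha, \mu) \\
\text{subject to} \quad & \alpha \ge 0, \ \mu \succeq 0.
\end{aligned} \tag{6}$$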

Lemma 1 (Constraint qualification): For the primal problem (4)
and the dual problem (6), the constraint qualification holds.

Proof: Let $t^*$ be an optimal solution to the primal
problem (4). Without loss of generality assume that $t_1^* > 0$,
i.e., $t_1 \ge 0$ is not an active constraint. The gradients of the
other constraints are $e_j$, $j \in \{2, \dots, N\}$, and $\mathbf{1}_N$, where $e_j$ is
the $j$-th standard Cartesian unit vector, and $\mathbf{1}_N$ is the $N$-tuple
with all ones. It immediately follows that these gradients are
linearly independent. Thus, the constraint qualification [7]
holds. ■

Theorem 2 (Time constrained static queue): For the optimization
problem (4), the optimal duration allocation vector
$t^*$ is an $N$-tuple with $m^*$ entries equal to $T/m^*$ and all other
entries zero, where

$$m^* = \operatorname*{argmax}_{m \in \{1, \dots, N\}} \; m f(T/m).$$

Proof: We apply the Karush-Kuhn-Tucker necessary
conditions [4] for an optimal solution $(t^*, \alpha^*, \mu^*)$.

Linear dependence of gradients:

$$\frac{\partial L}{\partial t_\ell}(t^*, \alpha^*, \mu^*) = f'(t_\ell^*) - \alpha^* + \mu_\ell^* = 0, \quad \forall \ell \in \{1, \dots, N\}. \tag{7}$$

Feasibility of the solution:

$$T - \mathbf{1}_N^T t^* \ge 0, \tag{8}$$
$$t^* \succeq 0. \tag{9}$$

Complementarity conditions:

$$\alpha^* (T - \mathbf{1}_N^T t^*) = 0, \tag{10}$$
$$\mu_\ell^* t_\ell^* = 0, \quad \forall \ell \in \{1, \dots, N\}. \tag{11}$$

Non-negativity of the multipliers:

$$\alpha^* \ge 0, \quad \mu^* \succeq 0. \tag{12}$$

Since $f$ is a strictly increasing function, the constraint (8)
should be active, and thus, from complementarity condition (10),
$\alpha^* > 0$. Further, from equation (11), if $t_\ell^* \ne 0$, then
$\mu_\ell^* = 0$. Assume that $t_\ell^* \ne 0$; then, for each such
$\ell \in \{1, \dots, N\}$, the equation (7) yields

$$f'(t_\ell^*) = \alpha^*. \tag{13}$$

If the equation (13) has no solution, then the Lagrangian
is a decreasing function of $t_\ell$ and the optimal duration
allocation is zero. But $t_\ell^* \ne 0$, by assumption. Thus, there
exists a solution of equation (13). Since the derivative of a
sigmoidal function is unimodal, there exist two solutions
of equation (13). We refer to these values of $t_\ell$ as $t^-$ and
$t^+$, with the understanding that $t^+ > t^-$. From Figure 2(a),
it follows that a local minimum exists at $t^-$, while a local
maximum exists at $t^+$. Further note that the Lagrangian is a
decoupled function of $t_\ell$, $\ell \in \{1, \dots, N\}$. Thus, the optimal
$t^*$ will contain no $t^-$ entry. Moreover, each entry will be
either zero or $t^+$. Let $m \le N$ entries of the optimal $t^*$ be
non-zero. Since the constraint (8) is active, these entries are
equal to $T/m$. Any allocation with $m$ identical non-zero entries and
remaining zero entries yields a reward $m f(T/m)$. Therefore,
the optimal non-zero entries are as stated in the theorem. ■

Remark 3 (Notes on concavity I): If the performance function
$f$ is a concave function, the optimal policy for the time
constrained queue is to allocate equal time to each task. □
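Theorem 2 reduces the time constrained problem to a search over the number $m$ of processed tasks. The sketch below carries out that search; the function names are hypothetical, and the logistic performance function $f(t) = 1/(1 + \exp(5 - t))$ with $N = 10$ and $T = 30$ is used only as an illustrative instance.

```python
import math


def f(t):
    """Assumed sigmoidal performance function (logistic, inflection at t = 5)."""
    return 1.0 / (1.0 + math.exp(5.0 - t))


def time_constrained_allocation(n_tasks, total_time, perf):
    """Return (m_star, allocation): m_star tasks get total_time / m_star, the rest 0."""
    m_star = max(range(1, n_tasks + 1),
                 key=lambda m: m * perf(total_time / m))
    allocation = [total_time / m_star] * m_star + [0.0] * (n_tasks - m_star)
    return m_star, allocation


if __name__ == "__main__":
    m_star, alloc = time_constrained_allocation(10, 30.0, f)
    print("tasks processed:", m_star)
    print("allocation:", alloc)
    print("total expected reward:", m_star * f(30.0 / m_star))
```

Which tasks receive the non-zero durations is immaterial here because the tasks are identical; the sketch simply places them first.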

V. Static queue with latency penalty 
A. Problem description 

Consider that the human operator has to perform $N \in \mathbb{N}$
identical tasks. Let the human operator assign time $t_\ell$ to
the task $\ell \in \{1, \dots, N\}$. The operator receives an expected
reward $f(t_\ell)$ for an assignment $t_\ell$ to the task $\ell$, while
she incurs a latency cost $c$ per unit time for the delay in
processing of each task. The following optimization problem
encapsulates the objective of the human operator:

$$\underset{t \succeq 0}{\text{maximize}} \quad \frac{1}{N} \sum_{\ell=1}^{N} \big( f(t_\ell) - c(N - \ell + 1)\, t_\ell \big), \tag{14}$$

where $t = \{t_1, \dots, t_N\}$ is the duration allocation vector.
B. Optimal policy 

Theorem 4 (Static queue with latency penalty): For the
optimization problem (14), the optimal allocation at stage
$\ell \in \{1, \dots, N\}$ is

$$t_\ell^* := \operatorname*{argmax}_{\beta \in \{0,\, t_\ell^\dagger\}} \big( f(\beta) - c(N - \ell + 1)\beta \big),$$

where

$$t_\ell^\dagger = \begin{cases}
0, & \text{if } f'(t_{\mathrm{inf}}) < c(N - \ell + 1), \\
\max\{t \in \mathbb{R}_{\ge 0} \mid f'(t) = c(N - \ell + 1)\}, & \text{otherwise.}
\end{cases}$$

Proof: The proof is similar to the proof of Theorem 2.
If $f'(t_{\mathrm{inf}}) < c(N - \ell + 1)$, then the objective is a decreasing
function of $t_\ell$, and the optimal allocation is zero. Otherwise,
the local maximum lies at the intersection of the penalty rate
$c(N - \ell + 1)$ with the decreasing portion of $f'$ (see
Figure 2(a)). The optimal allocation $t_\ell^*$ is determined by
comparing the value of the objective function at the local maximum
and at the boundary. ■

Remark 5 (Notes on concavity II): The optimal duration
allocation for the static queue with latency penalty decreases to
a critical value $t_{\mathrm{crit}} > t_{\mathrm{inf}}$ with increasing penalty rate, then
jumps down to zero. Instead, if the performance function $f$
is concave, then the optimal duration allocation decreases
continuously to zero with increasing penalty rate. □
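The stage-wise rule of Theorem 4 can be sketched as follows. This is an illustration under assumptions: the performance function is the logistic $f(t) = 1/(1 + \exp(5 - t))$ with $t_{\mathrm{inf}} = 5$, the larger solution of $f'(t) = c(N - \ell + 1)$ is located by bisection on the decreasing branch of $f'$, and all function names are hypothetical.

```python
import math

T_INF = 5.0  # inflection point of the assumed logistic performance function


def f(t):
    return 1.0 / (1.0 + math.exp(T_INF - t))


def f_prime(t):
    s = f(t)
    return s * (1.0 - s)


def largest_time_with_slope(y, t_hi=T_INF + 60.0, iters=200):
    """Largest t with f'(t) = y, by bisection where f' is decreasing."""
    lo, hi = T_INF, t_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f_prime(mid) > y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def static_latency_allocation(n_tasks, c):
    """Allocation for stages 1..N: compare t = 0 with the candidate t_dagger."""
    allocation = []
    for stage in range(1, n_tasks + 1):
        rate = c * (n_tasks - stage + 1)      # penalty rate seen at this stage
        if f_prime(T_INF) < rate:
            t_dagger = 0.0                    # objective decreases everywhere
        else:
            t_dagger = largest_time_with_slope(rate)
        best = max((0.0, t_dagger), key=lambda b: f(b) - rate * b)
        allocation.append(best)
    return allocation


if __name__ == "__main__":
    for stage, t in enumerate(static_latency_allocation(10, 0.02), start=1):
        print(f"stage {stage:2d}: allocate {t:.3f}")
```

For the assumed values $N = 10$ and $c = 0.02$, the early stages carry the highest penalty rate and are dropped, while the later stages receive progressively longer durations.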



VI. Dynamic queue with latency penalty 

A. Problem description 

Consider that the human operator has to serve a queue of
identical decision making tasks. Assume that the tasks arrive
as a Poisson process with rate $\lambda > 0$. We define the
processing of job $\ell \in \mathbb{N}$ as the stage $\ell$. Let the operator
assign deterministic time $t_\ell$ at stage $\ell$. The operator receives
an expected reward $f(t_\ell)$ for a duration allocation $t_\ell$ at stage
$\ell$, while she incurs a latency cost $c$ per unit time for the delay
in processing each task. We denote the queue length at stage
$\ell$ by $n_\ell$. The objective of the operator is to maximize her
infinite horizon expected reward. The following optimization
problem encapsulates the objective of the human operator:

$$\begin{aligned}
\text{maximize} \quad & \lim_{L \to \infty} \frac{1}{L} \sum_{\ell=1}^{L} \Big( f(t_\ell) - c\, \mathbb{E}[n_\ell]\, t_\ell - \frac{c \lambda t_\ell^2}{2} \Big) \\
\text{subject to} \quad & \mathbb{E}[n_{\ell+1}] = \max\{0, \mathbb{E}[n_\ell] - 1 + \lambda t_\ell\}, \\
& t_\ell \ge 0, \quad \forall \ell \in \mathbb{N},
\end{aligned} \tag{15}$$

where the term $c \lambda t_\ell^2 / 2$ is the expected penalty due to the
tasks arriving during stage $\ell$.

B. Optimal policy 

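Finite horizon optimization

We wish to approximately solve the infinite horizon
optimization problem (15) using receding horizon optimization
as in Algorithm 1. To do so, we pose the following finite
horizon optimization problem:

$$\begin{aligned}
\underset{t \succeq 0}{\text{maximize}} \quad & \frac{1}{N} \sum_{\ell=1}^{N} \Big( f(t_\ell) - c\, \mathbb{E}[n_\ell]\, t_\ell - \frac{c \lambda t_\ell^2}{2} \Big) \\
\text{subject to} \quad & \mathbb{E}[n_{\ell+1}] = \max\{0, \mathbb{E}[n_\ell] - 1 + \lambda t_\ell\},
\end{aligned} \tag{16}$$

where $t = \{t_1, \dots, t_N\}$ is the duration allocation vector.

Without loss of generality, we assume that the queue length
is always non-zero. If the queue is empty at some time, then
the operator waits till the arrival of a new task. There is
no explicit penalty for the operator being idle. Under this
assumption we have

$$\mathbb{E}[n_\ell] = n_1 - \ell + 1 + \lambda \sum_{j=1}^{\ell-1} t_j.$$

Some algebraic manipulations show that, up to the positive
factor $1/N$, the objective function of the optimization
problem (16) is equivalent to the
function $J : \mathbb{R}^N_{\ge 0} \to \mathbb{R}$ defined by

$$J(t) = \sum_{\ell=1}^{N} \Big( f(t_\ell) - c(n_1 - \ell + 1)\, t_\ell - c \lambda t_\ell \sum_{j=1}^{\ell-1} t_j - \frac{c \lambda t_\ell^2}{2} \Big),$$

where $c$ is the penalty rate, $\lambda$ is the arrival rate, and $n_1$ is
the initial queue length. Thus, the optimization problem (16)
is equivalent to

$$\underset{t \succeq 0}{\text{maximize}} \quad J(t). \tag{17}$$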
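The substitution of the expected queue lengths into the finite horizon objective can be checked numerically. The sketch below is an illustration under assumptions: it uses the logistic $f(t) = 1/(1 + \exp(5 - t))$, drops the $1/N$ normalization (which does not change the maximizer), and all function names are hypothetical. It evaluates the summed stage objective once through the queue recursion and once through the compact form in which the cross terms collapse to $\tfrac{c\lambda}{2}\big(\sum_\ell t_\ell\big)^2$.

```python
import math


def f(t):
    """Assumed sigmoidal performance function."""
    return 1.0 / (1.0 + math.exp(5.0 - t))


def expected_queue_lengths(t, n1, lam):
    """E[n_l] for l = 1..N, assuming the queue never empties."""
    lengths, n = [], float(n1)
    for t_l in t:
        lengths.append(n)
        n = n - 1.0 + lam * t_l   # one task served, lam * t_l expected arrivals
    return lengths


def objective_stagewise(t, n1, c, lam):
    """Sum of stage rewards f(t_l) - c E[n_l] t_l - c lam t_l^2 / 2."""
    q = expected_queue_lengths(t, n1, lam)
    return sum(f(t_l) - c * q_l * t_l - 0.5 * c * lam * t_l ** 2
               for t_l, q_l in zip(t, q))


def objective_compact(t, n1, c, lam):
    """Same quantity, with the quadratic terms written as (sum t_l)^2."""
    total = sum(t)
    linear = sum(c * (n1 - i) * t_l for i, t_l in enumerate(t))  # i = l - 1
    return sum(f(t_l) for t_l in t) - linear - 0.5 * c * lam * total ** 2


if __name__ == "__main__":
    t = [6.0, 0.0, 7.5]
    print(expected_queue_lengths(t, n1=3, lam=0.5))
    print(objective_stagewise(t, 3, 0.01, 0.5), objective_compact(t, 3, 0.01, 0.5))
```

The compact form makes it apparent that the coupling between stages enters only through the total allocated time.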



We define $f^\dagger : [0, f'(t_{\mathrm{inf}})] \to \mathbb{R}_{\ge 0}$ by

$$f^\dagger(y) = \max\{t \in \mathbb{R}_{\ge 0} \mid f'(t) = y\}.$$

Note that the definition of $f^\dagger$ is consistent with Figure 2(a).

For the optimization problem (16), assume that the optimal
policy allocates a strictly positive time only to the tasks
in the set $\mathcal{T}_{\text{processed}} \subseteq \{1, \dots, N\}$, which we call the set
of processed tasks. (Accordingly, the policy allocates zero
time to the tasks in $\{1, \dots, N\} \setminus \mathcal{T}_{\text{processed}}$.) Without loss of
generality, assume

$$\mathcal{T}_{\text{processed}} := \{\eta_1, \dots, \eta_m\},$$

where $\eta_1 < \dots < \eta_m$ and $m \le N$. A duration allocation
vector $t$ is said to be consistent with $\mathcal{T}_{\text{processed}}$ if only the
tasks in $\mathcal{T}_{\text{processed}}$ are allocated non-zero duration.

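Lemma 6 (Properties of maximum point): For the optimization
problem (17), and a set of processed tasks $\mathcal{T}_{\text{processed}}$, the
following statements hold:

i) A global maximum point $t^*$ satisfies $t^*_{\eta_j} \ge t^*_{\eta_k}$, for $j \ge k$,
$j, k \in \{1, \dots, m\}$.

ii) A local maximum point consistent with $\mathcal{T}_{\text{processed}}$
satisfies

$$f'(t^*_{\eta_k}) = c(n_1 - \eta_k + 1) + c \lambda \sum_{i=1}^{m} t^*_{\eta_i}, \quad \forall k \in \{1, \dots, m\}. \tag{18}$$

iii) The system of equations (18) can be reduced to

$$f'(t^*_{\eta_1}) = \mathcal{P}(t^*_{\eta_1}), \quad \text{and} \quad t^*_{\eta_k} = f^\dagger\big(f'(t^*_{\eta_1}) - c(\eta_k - \eta_1)\big),$$

for each $k \in \{2, \dots, m\}$, where
$\mathcal{P} : \mathbb{R}_{>0} \to \mathbb{R} \cup \{+\infty\}$ is defined by

$$\mathcal{P}(t) = \begin{cases}
p(t), & \text{if } f'(t) \ge c(\eta_m - \eta_1), \\
+\infty, & \text{otherwise,}
\end{cases}$$

where

$$p(t) = c\Big(n_1 - \eta_1 + 1 + \lambda t + \lambda \sum_{k=2}^{m} f^\dagger\big(f'(t) - c(\eta_k - \eta_1)\big)\Big).$$

iv) A local maximum point consistent with $\mathcal{T}_{\text{processed}}$
satisfies

$$f''(t_{\eta_k}) \le c \lambda, \quad \text{for each } k \in \{1, \dots, m\}.$$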

Proof: We start by proving the first statement. Assume
$t^*_{\eta_j} < t^*_{\eta_k}$ for some $j > k$, and define the allocation vector
$\tilde{t}$ consistent with $\mathcal{T}_{\text{processed}}$ by

$$\tilde{t}_{\eta_i} = \begin{cases}
t^*_{\eta_i}, & \text{if } i \in \{1, \dots, m\} \setminus \{j, k\}, \\
t^*_{\eta_j}, & \text{if } i = k, \\
t^*_{\eta_k}, & \text{if } i = j.
\end{cases}$$

It is easy to see that

$$J(t^*) - J(\tilde{t}) = c(\eta_j - \eta_k)(t^*_{\eta_j} - t^*_{\eta_k}) < 0.$$

This inequality contradicts the assumption that $t^*$ is a
solution of the optimization problem (17).

To prove the second statement, note that a local maximum
is achieved at the boundary of the feasible region or at the
set where the Jacobian of $J$ is zero. At the boundary of the
feasible region $\mathbb{R}^N_{\ge 0}$, some of the allocations are zero. Given
the $m$ non-zero allocations, the Jacobian of the function $J$
projected on the space spanned by the non-zero allocations
must be zero. The expressions in the lemma are obtained
by setting the Jacobian to zero.

To prove the third statement, we subtract the expression in
equation (18) for $k = j$ from the expression for $k = 1$ to get

$$f'(t_{\eta_j}) = f'(t_{\eta_1}) - c(\eta_j - \eta_1). \tag{19}$$

There exists a solution of equation (19) if and only if
$f'(t_{\eta_1}) \ge c(\eta_j - \eta_1)$. If $f'(t_{\eta_1}) < c(\eta_j - \eta_1) + f'(0)$, then
there exists only one solution. Otherwise, there exist two
solutions. It can be seen that if there exist two solutions
$t_j^-, t_j^+$ with $t_j^- < t_j^+$, then $t_j^- < t_{\eta_1} < t_j^+$. From i), the only
possible allocation is $t_j^+$. Notice that
$t_j^+ = f^\dagger\big(f'(t_{\eta_1}) - c(\eta_j - \eta_1)\big)$. This yields a feasible time
allocation to each task $\eta_j$, $j \in \{2, \dots, m\}$, parametrized by the
time allocation to the task $\eta_1$. A typical allocation is shown in
Figure 4(a). We further note that the effective penalty rate for
the task $\eta_1$ is $c(n_1 - \eta_1 + 1) + c \lambda \sum_{j=1}^{m} t_{\eta_j}$. Using the
expression of $t_{\eta_j}$, $j \in \{2, \dots, m\}$, parametrized by $t_{\eta_1}$, we
obtain the expression for $\mathcal{P}$.

To prove the last statement, we observe that the Hessian of
the function $J$ is

$$\frac{\partial^2 J}{\partial t^2} = \operatorname{diag}\big(f''(t_{\eta_1}), \dots, f''(t_{\eta_m})\big) - c \lambda \mathbf{1}_m \mathbf{1}_m^T,$$

where $\operatorname{diag}(\cdot)$ represents a diagonal matrix with the argument
as diagonal entries. For a local maximum to exist at non-zero
duration allocations $\{t_{\eta_1}, \dots, t_{\eta_m}\}$, the Hessian must
be negative semidefinite. A necessary condition for the
Hessian to be negative semidefinite is that its diagonal entries
$f''(t_{\eta_k}) - c\lambda$ are non-positive. ■
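The stationarity and curvature conditions of Lemma 6 can be traced through a short calculation. The following is a sketch that assumes the finite horizon objective is written without the $1/N$ normalization and uses the identity $\sum_{\ell} t_\ell \sum_{j<\ell} t_j + \tfrac{1}{2}\sum_{\ell} t_\ell^2 = \tfrac{1}{2}\big(\sum_{\ell} t_\ell\big)^2$.

```latex
\begin{align*}
J(t) &= \sum_{\ell=1}^{N} \Big( f(t_\ell) - c(n_1 - \ell + 1)\, t_\ell \Big)
        - \frac{c\lambda}{2} \Big( \sum_{\ell=1}^{N} t_\ell \Big)^{2}, \\
\frac{\partial J}{\partial t_k} &= f'(t_k) - c(n_1 - k + 1)
        - c\lambda \sum_{\ell=1}^{N} t_\ell, \\
\frac{\partial^2 J}{\partial t_k\, \partial t_j} &=
        f''(t_k)\, \delta_{kj} - c\lambda ,
\end{align*}
```

where $\delta_{kj}$ is the Kronecker delta. Restricting to the processed tasks $\eta_1, \dots, \eta_m$ (all other allocations being zero) and setting the first derivatives to zero gives the system (18). The diagonal entries of the second derivative matrix are $f''(t_{\eta_k}) - c\lambda$, so negative semidefiniteness requires $f''(t_{\eta_k}) \le c\lambda$, which is statement iv).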

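We refer to the function $\mathcal{P}$ as the effective penalty rate for
the first processed task. A typical graph of $\mathcal{P}$ is shown in
Figure 4(b). Given $\mathcal{T}_{\text{processed}}$, a feasible allocation to the task
$\eta_1$ is such that $f'(t_{\eta_1}) - c(\eta_j - \eta_1) > 0$, for each
$j \in \{2, \dots, m\}$. For a given $\mathcal{T}_{\text{processed}}$, we define the minimum
feasible duration allocated to task $\eta_1$ (see Figure 4(a)) by

$$\tau_1 := \begin{cases}
\min\{t \in \mathbb{R}_{\ge 0} \mid f'(t) = c(\eta_m - \eta_1)\}, & \text{if } f'(t_{\mathrm{inf}}) \ge c(\eta_m - \eta_1), \\
0, & \text{otherwise.}
\end{cases}$$

Let $f''_{\max}$ be the maximum value of $f''$. We now define the
points at which the function $f'' - c\lambda$ changes its sign (see
Figure 2(b)):

$$\delta_1 := \begin{cases}
\min\{t \in \mathbb{R}_{\ge 0} \mid f''(t) = c\lambda\}, & \text{if } c\lambda \in [f''(0), f''_{\max}], \\
0, & \text{otherwise,}
\end{cases}$$

$$\delta_2 := \begin{cases}
\max\{t \in \mathbb{R}_{\ge 0} \mid f''(t) = c\lambda\}, & \text{if } c\lambda \le f''_{\max}, \\
0, & \text{otherwise.}
\end{cases}$$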
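The two sign-change points can be computed numerically for a concrete sigmoid. The sketch below is an illustration under assumptions: it uses the logistic $f(t) = 1/(1 + \exp(5 - t))$, for which $f''$ peaks at $t = 5 - \ln(2 + \sqrt{3})$ and vanishes at $t_{\mathrm{inf}} = 5$, and the function names are hypothetical.

```python
import math

T_INF = 5.0
# Location of the maximum of f'' for the assumed logistic function.
T_PEAK = T_INF - math.log(2.0 + math.sqrt(3.0))


def f(t):
    return 1.0 / (1.0 + math.exp(T_INF - t))


def f_second(t):
    s = f(t)
    return s * (1.0 - s) * (1.0 - 2.0 * s)


def sign_change_points(c, lam, iters=200):
    """Return (delta_1, delta_2) for the function f'' - c * lam."""
    y = c * lam
    if y > f_second(T_PEAK):
        return 0.0, 0.0                     # f'' never reaches c * lam
    # delta_2: largest root, on [T_PEAK, T_INF] where f'' decreases to zero.
    lo, hi = T_PEAK, T_INF
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f_second(mid) > y:
            lo = mid
        else:
            hi = mid
    delta_2 = 0.5 * (lo + hi)
    # delta_1: smallest root, on [0, T_PEAK] where f'' increases.
    if y < f_second(0.0):
        delta_1 = 0.0                       # c * lam lies below f''(0)
    else:
        lo, hi = 0.0, T_PEAK
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if f_second(mid) < y:
                lo = mid
            else:
                hi = mid
        delta_1 = 0.5 * (lo + hi)
    return delta_1, delta_2


if __name__ == "__main__":
    print(sign_change_points(c=0.01, lam=0.5))
    print(sign_change_points(c=0.1, lam=0.5))
```

For small values of $c\lambda$ the point $\delta_1$ collapses to zero and $\delta_2$ sits just below the inflection point, so the region in which a local maximum can occur is essentially the decreasing branch of $f'$.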




Fig. 4. (a) Feasible allocations to the second task parametrized by the
allocation to the first task. (b) The penalty rate and the reward rate due
to the allocation to the first task.



Remark 8 (Notes on concavity III): With the increasing
penalty rate as well as the increasing arrival rate, the time
duration allocation decreases to a critical value $t_{\mathrm{crit}} > t_{\mathrm{inf}}$
and then jumps down to zero, for the dynamic queue with
latency penalty. Instead, if the performance function $f$ is
concave, then the duration allocation decreases continuously
to zero with increasing penalty rate as well as increasing
arrival rate. □



Theorem 7 (Dynamic queue with latency penalty): For the
optimization problem (17), consider a set of processed tasks
$\mathcal{T}_{\text{processed}}$. The following statements hold:

i) there exists a local maximum point consistent with
$\mathcal{T}_{\text{processed}}$, if

$$f'(\delta_2) \ge \mathcal{P}(\delta_2); \tag{20}$$

ii) there exists a local maximum point consistent with
$\mathcal{T}_{\text{processed}}$, if

$$f'(\tau_1) \le \mathcal{P}(\tau_1), \quad f'(\delta_1) \ge \mathcal{P}(\delta_1), \quad \text{and} \quad \delta_1 \ge \tau_1; \tag{21}$$

iii) if both conditions (20) and (21) are false, then there
exists no local maximum point consistent with $\mathcal{T}_{\text{processed}}$.

Proof: A critical allocation to task $\eta_1$ is located at the
intersection of the graph of the reward rate function $f'(t_{\eta_1})$
with the effective penalty function $\mathcal{P}(t_{\eta_1})$. From Lemma 6, a
necessary condition for the existence of a local maximum at a
critical point is $f''(t_{\eta_1}) \le c\lambda$, which holds for
$t_{\eta_1} \in \, ]0, \delta_1] \cup [\delta_2, \infty[$. It can be seen that if
condition (20) holds, then the reward rate function $f'(t_{\eta_1})$ and
the effective penalty function $\mathcal{P}(t_{\eta_1})$ intersect in the region
$[\delta_2, \infty[$. Similarly, condition (21) ensures the intersection of
the graph of the reward rate function $f'(t_{\eta_1})$ with the effective
penalty function $\mathcal{P}(t_{\eta_1})$ in the region $]0, \delta_1]$. ■

We now provide a procedure to solve the optimization
problem (16). Given a sequence of zero and non-zero
allocations $\zeta \in \{0, +\}^N$, we denote the corresponding
critical allocation for maximum by $t(\zeta)$. The details of the
procedure are shown in Algorithm 2.

Algorithm 2 Optimal allocation for decision making queue
1: given $n_1$, $N$, $c$, $\lambda$
2: $k := 0$; $\mathcal{A} := \emptyset$;
3: for each string $\zeta \in \{0, +\}^N$
4:    set $\mathcal{T}_{\text{processed}} := \{i \in \{1, \dots, N\} \mid \zeta_i = +\}$
5:    if condition (20) or (21)
6:    then determine critical allocations for maximum
         $t_{\eta_1}(\zeta)$ via bisection algorithm
7:       determine allocations $t_{\eta_j}(\zeta)$, $j \in \{2, \dots, m\}$
8:       determine expected queue lengths
         $\mathbb{E}[n_\ell]$, $\ell \in \{1, \dots, N\}$
9:       if $\mathbb{E}[n_\ell] > 0$, $\forall \ell \in \{1, \dots, N\}$
10:      then $\mathcal{A} := \mathcal{A} \cup \{t(\zeta)\}$
11: optimal allocation $t^* = \operatorname{argmax}_{t \in \mathcal{A}} J(t)$

VII. Numerical examples 

We now illustrate the optimal policies for the three problems
through some numerical examples. We consider three
examples. In the first and second example, we demonstrate the
application of Theorems 2 and 4, respectively. In the third
example, Algorithm 2 is utilized in a receding horizon
fashion.

Example 9: If the human operator has to serve $N = 10$ tasks
in time $T = 30$ secs, and the human receives an expected
reward $f(t) = 1/(1 + \exp(5 - t))$ for an allocation of duration
$t$ secs to a task, then the optimal policy for the human is to
drop six tasks, and allocate 7.5 secs to any four tasks. An
optimal allocation is shown in Figure 5(a). □

Example 10: If the human operator has to serve $N = 10$
tasks and the human receives an expected reward
$f(t) = 1/(1 + \exp(5 - t))$ for an allocation of duration $t$ secs to a
task, while she incurs a penalty $c = 0.02$ per sec for each
pending task, then the optimal policy for the human is as shown
in Figure 5(b). Note that the optimal duration allocation
increases with decreasing number of pending tasks. □



Fig. 5. Optimal allocations: (a) time constrained static queue,
(b) static queue with latency penalty.

Example 11: Consider a human operator who serves a queue of
tasks with Poisson arrivals at rate $\lambda = 0.5$ per sec. The
human receives an expected reward $f(t) = 1/(1 + \exp(5 - t))$
for an allocation of duration $t$ secs to a task, while she
incurs a penalty $c = 0.01$ per sec for each task in the queue.
We solved an optimization problem with horizon length
$N = 10$ at each stage to determine the receding horizon
optimization solution. A receding horizon policy for the
expected evolution of the queue, at different arrival rates,
is shown in Figure 6. A receding horizon duration allocation
policy for a sample evolution of the queue, at different arrival
rates, is shown in Figure 7. The duration allocations for
a greedy policy, i.e., an optimization with horizon length
$N = 1$ at each stage, are shown in Figure 8. The
expected benefit $J(t)$ for the receding horizon and the greedy
policy is shown in Figure 9. It can be seen that the maximum
benefit is obtained at an arrival rate at which one expects only
one task in the queue at each time. As expected, the performance
of the greedy policy and the receding horizon policy is almost the
same at this arrival rate. □

Fig. 6. Receding horizon policy for expected evolution of the dynamic
queue with latency penalty. An optimization problem with horizon length
$N = 10$ is solved at each stage.
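The greedy policy of Example 11, i.e., the horizon length $N = 1$ case, amounts to a one-dimensional maximization of $f(t) - c\,n\,t - c\lambda t^2/2$ at each stage. The sketch below is an illustration under assumptions: it replaces the bisection step by a plain grid search on $[0, 20]$ secs, propagates the expected queue length with the recursion $\max\{0, n - 1 + \lambda t\}$, and uses hypothetical function names.

```python
import math


def f(t):
    """Performance function used in the numerical examples."""
    return 1.0 / (1.0 + math.exp(5.0 - t))


def greedy_allocation(n, c, lam, t_max=20.0, step=1e-3):
    """Grid search for the maximizer of f(t) - c*n*t - c*lam*t^2/2 on [0, t_max]."""
    best_t, best_val = 0.0, f(0.0)
    k = 1
    while k * step <= t_max:
        t = k * step
        val = f(t) - c * n * t - 0.5 * c * lam * t * t
        if val > best_val:
            best_t, best_val = t, val
        k += 1
    return best_t


def simulate_greedy(n1, c, lam, stages):
    """Expected queue length and greedy allocation at each stage."""
    n, history = float(n1), []
    for _ in range(stages):
        t = greedy_allocation(n, c, lam)
        history.append((n, t))
        n = max(0.0, n - 1.0 + lam * t)   # expected queue length recursion
    return history


if __name__ == "__main__":
    for n, t in simulate_greedy(n1=1, c=0.01, lam=0.5, stages=10):
        print(f"expected queue length {n:6.2f}  allocation {t:6.3f}")
```

When the expected queue length is large, the penalty rate $c\,n$ exceeds the maximum slope of $f$ and the grid search returns zero, which corresponds to dropping the task.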



(a) Low arrival rate

Fig. 8. Greedy policy for the expected evolution of the dynamic queue
with latency penalty. An optimization problem with horizon length $N = 1$
is solved at each stage.




Fig. 9. Expected benefit over a finite horizon for the receding horizon and 
greedy policies 



VIII. Conclusions 

We presented an analysis of decision making queues.
Three particular problems were discussed. First, a time
constrained decision making queue with no arrivals was
considered. We showed that the optimal policy may drop some
tasks and assign equal time to the remaining tasks. Second, a
decision making queue with no arrivals and a latency penalty
was considered. It was observed that the optimal policy may
still drop some tasks. Further, the duration allocation to the
tasks increased with the decreasing queue length. Last, a
dynamic decision making queue with latency penalty was considered.
A receding horizon policy was developed to determine the
duration allocation. It was observed that the resulting
policy may drop some tasks.

The decision support system designed in this paper assumes
that the arrival rate of the tasks as well as the parameters
in the performance function are known. An interesting open
problem is to estimate the parameters of the performance
function, and to come up with policies which perform an online
estimation of the arrival rate and accordingly determine the
optimal allocation policy.



(c) High arrival rate

Fig. 7. Receding horizon policy for a sample evolution of the dynamic
queue with latency penalty. An optimization problem with horizon length
$N = 10$ is solved at each stage.



Remark 12 (Optimal arrival rate): It can be seen that the
total expected benefit is maximum when there is always only
one task in the queue. If there is more than one task in the
queue, then the operator incurs more penalty for the
same reward. Thus, the optimal arrival rate is the one at which
a task arrival is expected at the time when the previous task
loses all its value, i.e., at
$\tau^* = \max\{t \in \mathbb{R}_{\ge 0} \mid f'(t) = 2c\}$.
In general, there would be performance goals for the
operator, and a higher task arrival rate for the queue could
be designed. Such a problem can be analyzed by putting a
saturation on the sigmoidal performance function, and thus
obtaining a new sigmoidal performance function. □
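As a worked illustration of the expression for $\tau^*$, take the performance function and penalty rate of Example 11, $f(t) = 1/(1 + \exp(5 - t))$ and $c = 0.01$. For this logistic function $f'(t) = f(t)\,(1 - f(t))$, so the condition $f'(t) = 2c$ is a quadratic in $f$; the larger root corresponds to the decreasing branch of $f'$. Reading the remark as one arrival per $\tau^*$ secs is an interpretation, and the numbers are rounded.

```latex
\begin{align*}
f(\tau^*)\big(1 - f(\tau^*)\big) &= 2c = 0.02
  \;\Longrightarrow\;
  f(\tau^*) = \frac{1 + \sqrt{1 - 8c}}{2} = \frac{1 + \sqrt{0.92}}{2} \approx 0.9796, \\
\tau^* &= 5 + \ln \frac{f(\tau^*)}{1 - f(\tau^*)} \approx 5 + \ln(47.98) \approx 8.87 \text{ secs}, \\
\frac{1}{\tau^*} &\approx 0.113 \text{ tasks per sec.}
\end{align*}
```

Under this reading, the arrival rate $\lambda = 0.5$ per sec of Example 11 is well above the rate at which only one task is expected in the queue.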